Chinese Spelling Error Detection Using a Fusion Lattice LSTM

Abstract

Spelling error detection serves as a crucial preprocessing step in many natural language processing applications. Unlike English, where every single word is typed directly on the keyboard, we have to use an input method to type Chinese characters. The pinyin input method is the most widely used. Intuitively, pinyin should be helpful for detecting spelling errors. However, when detecting spelling errors, most current methods ignore pinyin information and adopt a pipeline framework that leads to error propagation. In this article, we propose a fusion lattice-LSTM model under an end-to-end framework to integrate character, word, and pinyin features for Chinese spelling error detection. Experiments on the SIGHAN Bake-off-2015 dataset show that pinyin is a discriminating feature and that our model clearly outperforms the baseline models.

Similar resources

Analysis of Sindhi Spelling Error Patterns for Spelling Error Detection and Correction

Statistical analysis of spelling error trends in a language plays an important role in automatic spelling error detection and correction. Comprehensive statistical analysis of spelling error trends for Sindhi is still a subject of research. This research study identifies and analyses the spelling error trends in Sindhi. The statistical analysis of error trends is based on a real time corpus collecte...

Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis

Grammatical Error Diagnosis for Chinese has always been a challenge for both foreign learners and NLP researchers, for the variousity of grammar and the flexibility of expression. In this paper, we present a model based on Bidirectional Long Short-Term Memory(Bi-LSTM) neural networks, which treats the task as a sequence labeling problem, so as to detect Chinese grammatical errors, to identify t...

Disfluency Detection Using a Bidirectional LSTM

We introduce a new approach for disfluency detection using a Bidirectional Long-Short Term Memory neural network (BLSTM). In addition to the word sequence, the model takes as input pattern match features that were developed to reduce sensitivity to vocabulary size in training, which lead to improved performance over the word sequence alone. The BLSTM takes advantage of explicit repair states in...

A Survey of Spelling Error Detection and Correction Techniques

Spelling Correction is a process of detecting and sometimes providing suggestions for incorrectly spelled words in a text. Spell Checker is an application program that flags words in a document that may not be spelled correctly. Spell Checker may be stand-alone, capable of operating on a block of text, such as word processor, electronic dictionary. When some text is given as an input to spell chec...

Word Vector/Conditional Random Field-based Chinese Spelling Error Detection for SIGHAN-2015 Evaluation

In order to detect Chinese spelling errors, especially for essays written by foreign learners, a word vector/conditional random field (CRF)based detector is proposed in this paper. The main idea is to project each word in a test sentence into a high dimensional vector space in order to reveal and examine their relationships by using a CRF. The results are then utilized to constrain the time-con...

Journal

Journal title: ACM Transactions on Asian and Low-Resource Language Information Processing

Year: 2021

ISSN: 2375-4699, 2375-4702

DOI: https://doi.org/10.1145/3426882